4 research outputs found

Detection and identification of elliptical structure arrangements in images: theory and algorithms

Author: Pătrăucean Viorica
Publication venue: INPT
Publication date: 19/01/2012

This thesis addresses the detection, fitting, and identification of elliptical features in digital images. We place geometric feature detection in the a contrario statistical framework to obtain a combined, parameter-free detector of line segments and circular/elliptical arcs that controls the number of false detections. To improve the accuracy of the detected features, especially for occluded circles and ellipses, we introduce a simple closed-form conic fitting technique that combines the algebraic distance with the gradient orientation. Identifying a configuration of coplanar circles in images through a discriminant signature usually requires the Euclidean reconstruction of the plane containing the circles. We propose an efficient signature computation method that bypasses the Euclidean reconstruction; it relies exclusively on invariant properties of the projective plane and is therefore itself invariant under perspective.
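
For readers unfamiliar with the a contrario framework named in the abstract, the following is a minimal sketch of its usual validation test, not the thesis's implementation. A candidate is accepted when its number of false alarms (NFA), the number of tests multiplied by a binomial tail probability, is at most a threshold epsilon. The function names, the number of tests, and the value p = 1/8 are assumptions chosen for illustration.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P[X >= k] for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def nfa(n_tests: int, n: int, k: int, p: float) -> float:
    """Number of false alarms: expected count of candidates that would look
    at least this good under the background (noise) model.

    n_tests: number of candidates tested (hypothetical value in this sketch)
    n:       pixels in the candidate's support region
    k:       pixels whose gradient orientation agrees with the candidate
    p:       probability that a noise pixel agrees by chance
    """
    return n_tests * binomial_tail(n, k, p)


def is_meaningful(n_tests: int, n: int, k: int, p: float, epsilon: float = 1.0) -> bool:
    """Accept the candidate when its NFA does not exceed epsilon."""
    return nfa(n_tests, n, k, p) <= epsilon


if __name__ == "__main__":
    # Assumed illustration: 1000 tests, 20-pixel support, p = 1/8.
    for aligned in (2, 10, 20):
        print(aligned, nfa(1000, 20, aligned, 0.125))
```

Because the threshold bounds the expected number of detections in noise, the same test can be applied to segments and arcs alike without tuning a detector-specific parameter.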

Open Archive Toulouse Archive Ouverte

Institut National Polytechnique de Toulouse (Theses)

Détection de primitives linéaires et circulaires par une approche a contrario

Author: Conter Jean
Gurdjos Pierre
Morin Géraldine
Pătrăucean Viorica
Publication venue: HAL CCSD
Publication date: 01/06/2011

National audience

Low-level image understanding requires different detectors for basic primitives, such as line segments or circular arcs. Most existing detectors face problems that have been (and still are) extensively studied, such as parameter tuning, control of the number of false detections, and execution time. In this paper, we focus on simultaneously detecting lines and circles in an image while controlling the number of false detections and without any parameter tuning. We present an algorithm that extends the Line Segment Detector (LSD) to circles, both being based on the a contrario approach. Because the proposed detector targets two different types of primitives, the a contrario validation is used as a model selection criterion, which is new in a contrario-based work. In addition, we propose a new algebraic method for estimating a circle that uses the gradient direction of the contour points as well as their positions.
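
As background for the algebraic circle estimation mentioned in the abstract, here is a minimal sketch of the classical position-only algebraic least-squares circle fit (often attributed to Kåsa). It is not the authors' method, which additionally uses the gradient direction at the contour points; that extension is not reproduced here. All function names are hypothetical.

```python
from math import sqrt


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (m[r][3] - s) / m[r][r]
    return x


def fit_circle_algebraic(points):
    """Fit x^2 + y^2 + D*x + E*y + F = 0 in the least-squares sense.

    Returns (cx, cy, radius). Requires at least three non-collinear points.
    """
    # Normal equations (A^T A) u = A^T b, with rows [x, y, 1] and b = -(x^2 + y^2).
    ata = [[0.0] * 3 for _ in range(3)]
    atb = [0.0] * 3
    for x, y in points:
        row = (x, y, 1.0)
        rhs = -(x * x + y * y)
        for i in range(3):
            atb[i] += row[i] * rhs
            for j in range(3):
                ata[i][j] += row[i] * row[j]
    d, e, f = solve3(ata, atb)
    cx, cy = -d / 2.0, -e / 2.0
    return cx, cy, sqrt(cx * cx + cy * cy - f)


if __name__ == "__main__":
    # Four points on the circle centred at (2, -1) with radius 3.
    print(fit_circle_algebraic([(5, -1), (2, 2), (-1, -1), (2, -4)]))
```

The fit is linear in (D, E, F), which is what makes a closed-form solution possible; adding gradient-direction constraints, as the abstract describes, would contribute further equations to the same kind of system.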

Scientific Publications of the University of Toulouse II Le Mirail

Perception Test: A Diagnostic Benchmark for Multimodal Video Models

Author: Aytar Yusuf
Banarse Dylan
Carreira João
Continente Adrià Recasens
Damen Dima
Doersch Carl
Frechette Alex
Gupta Ankush
Heyward Joseph
Klimczak Hanna
Koppula Skanda
Koster Raphael
Malinowski Mateusz
Markeeva Larisa
Matejovicova Tatiana
Miech Antoine
Osindero Simon
Pătrăucean Viorica
Smaira Lucas
Sulsky Yury
Winkler Stephanie
Yang Yi
Zhang Junlin
Zisserman Andrew
Publication date: 23/05/2023

We propose a novel multimodal video benchmark, the Perception Test, to evaluate the perception and reasoning skills of pre-trained multimodal models (e.g. Flamingo, BEiT-3, or GPT-4). Compared to existing benchmarks that focus on computational tasks (e.g. classification, detection or tracking), the Perception Test focuses on skills (Memory, Abstraction, Physics, Semantics) and types of reasoning (descriptive, explanatory, predictive, counterfactual) across video, audio, and text modalities, to provide a comprehensive and efficient evaluation tool. The benchmark probes pre-trained models for their transfer capabilities, in a zero-shot / few-shot or limited finetuning regime. For these purposes, the Perception Test introduces 11.6k real-world videos, 23s average length, designed to show perceptually interesting situations, filmed by around 100 participants worldwide. The videos are densely annotated with six types of labels (multiple-choice and grounded video question-answers, object and point tracks, temporal action and sound segments), enabling both language and non-language evaluations. The fine-tuning and validation splits of the benchmark are publicly available (CC-BY license), in addition to a challenge server with a held-out test split. Human baseline results compared to state-of-the-art video QA models show a significant gap in performance (91.4% vs 43.6%), suggesting that there is significant room for improvement in multimodal video understanding. Dataset, baselines code, and challenge server are available at https://github.com/deepmind/perception_test

Comment: 25 pages, 11 figures

arXiv.org e-Print Archive

Triangulation Algorithms for Generating As-Is Floor Plans

Author: Christine Chevrier
De Berg
Hossam ElGindy
Ioannis Brilakis
Jaehoon Jung
José Pinto Duarte
Mario Carpo
Maurice Murphy
Michael R Garey
Salman Khalili-Araghi
Viorica Pătrăucean
Publication venue: Springer Science and Business Media LLC

Crossref